The problem of deciding whether one point in a program is data dependent upon another is
fundamental to program analysis and has been widely studied. In this paper we consider this
problem at the abstraction level of program schemas in which computations occur in the Herbrand
domain of terms and predicate symbols, which represent arbitrary predicate functions, are allowed.
Given a vertex l in the flowchart of a schema S having only equality (variable copying) assignments,
and variables v, w, we show that it is PSPACE-hard to decide whether there exists an execution
of a program defined by S in which v holds the initial value of w at at least one occurrence of
l on the path of execution, with membership in PSPACE holding provided there is a constant
upper bound on the arity of any predicate in S. We also consider the 'dual' problem in which
v is required to hold the initial value of w at every occurrence of l, for which the analogous
results hold. Additionally, the former problem for programs with non-deterministic branching (in
effect, free schemas) in which assignments with functions are allowed is proved to be
polynomial-time decidable provided a constant upper bound is placed upon the number of occurrences
of the concurrency operator in the schemas being considered. This result is promising since many
concurrent systems have a relatively small number of threads (concurrent processes), especially
when compared with the number of statements they have.

Categories and Subject Descriptors: D.3.3 [Programming Languages]: Language Constructs 
— control structures; D.3.4 [Programming Languages]: Processors — compilers, optimisation 
u := h();
if p(u) then v := f(u);
else v := g();

Fig. 1. Schema S

1. INTRODUCTION 

A schema represents the statement structure of a program by replacing real func- 
tions and predicates by symbols representing them. A schema, S, thus defines a 
whole class of programs which all have the same structure. Each program can be 
obtained from S via a domain D and an interpretation i which defines a function 
$f^i : D^n \to D$ for each function symbol f of arity n, and a predicate function
$p^i : D^m \to \{T, F\}$ for each predicate symbol p of arity m. As an example, Figure
1 gives a schema S, and the program P of Figure 2 is defined from S by interpreting
the function symbols f, g, h and the predicate symbol p as given by P, with
D being the set of integers. The subject of schema theory is connected with that
of program transformation and was originally motivated by the wish to compile 
programs effectively [Greibach 1975]. Many results on schema equivalence [Danicic 
et al. 2007; Laurence et al. 2004; 2003; Sabelfeld 1990; Luckham et al. 1970] and 
on applying schema formulation to program slicing [Laurence 2005; Danicic et al. 
2005] have been published. 

In this paper we are concerned with establishing complexity bounds for data de- 
pendence problems defined on schemas. We only consider schema interpretations 
over the Herbrand domain of terms in the variables and function symbols. We con- 
sider the problem of deciding the following two properties, defined using a schema 
S, a variable v, a variable or function symbol f and a vertex l in the flowchart of
S.

— (Existential data dependence.) If there is an executable path through S that
ends at l at which point the term defined by v contains the symbol f, then
$\exists DD_S(f, v, l)$ is said to hold.

— (Universal data dependence.) If, for all executable paths through S, the term
defined by v contains the symbol f whenever l is reached, then $\forall DD_S(f, v, l)$ is
said to hold.

If S belongs to the class of schemas in which all assignments are equality
assignments (that is, assignments of the form v := w; in which the value held by a variable
w is copied to v), we prove the following.

— The problems defined by these properties are both PSPACE-hard, even when S
is further required to belong to the class of schemas in which no concurrency 



u := 1;
if u > 1 then v := u + 1;
else v := 2;

Fig. 2. Program P
x := f();
while p(z) x := g();
l: y := x;

Fig. 3. A schema demonstrating the greater precision obtainable by considering data dependence 
problems defined on program schemas compared with non-deterministic programs. 

constructs are allowed and only two while loops are permitted in S, one of which 
lies in the body of the other, and no predicate symbol occurs more than once. 

— If S is required to contain no loops or concurrency constructs, and each of its
predicate symbols has zero arity, then $\exists DD_S(f, v, l)$ is NP-hard, and
$\forall DD_S(f, v, l)$ is co-NP-hard.

— Both problems lie in PSPACE provided there is a constant upper bound on the 
arity of any predicate in S. 

Additionally, we consider the existential data dependence problem in the case 
where assignments having function symbols are allowed, but where all schemas are 
free (that is, all paths are executable) and hence all branching is, in effect, non- 
deterministic. One possible application of data dependence on a function symbol 
f would be in the case where f corresponds to a call to a function or method that
we are altering; we might then want to decide whether this change can propagate 
through to the value of a particular variable at a particular point. For the class of 
free schemas, we prove the following. 

— Deciding existential data dependence is shown to be PSPACE-complete, owing 
to a reduction from the finite intersection problem for deterministic finite state 
automata. 

— Under the further condition that a constant upper bound is placed upon the num- 
ber of occurrences of the concurrency operator in the schemas being considered, 
existential data dependence then becomes decidable in polynomial time. 

To the authors' knowledge, neither problem has been previously considered for
arbitrary schemas. Both problems have been studied for programs of various types. In
[Müller-Olm and Seidl 2001], it is proved that deciding existential data dependence
(expressed in the paper as a slicing problem) is PSPACE-complete for programs
having concurrency constructs, but only non-deterministic branching. Müller-Olm
et al. have also considered a generalisation of our universal data dependence
problem [2005a; 2005b], defined by testing for equality between two terms at
particular program points, but their programs use term inequality guards on edges in
flowcharts, and apart from this restriction, their programs are non-deterministic.
In [Müller-Olm and Rüthing 2001], an extensive classification of the complexity of
deciding both our problems is given, but branching is non-deterministic and the
domain is that of the integers in every case.

Schemas represent a significantly closer approximation to real-life programs than 
purely non-deterministic programs, even when these are very simple. To demon- 
strate this, consider the schema S in Figure 3, in which x, y and z are distinct 
variables. 

Clearly $\exists DD_S(g, y, l)$ does not hold, since execution cannot enter the while loop
in S and subsequently leave it, whereas if the while loop is replaced by the line
loop x := g(); to give a non-deterministic schema T, then $\exists DD_T(g, y, l)$ holds. This
example motivates extending the study of data dependence problems to schemas, 
since the gain in precision may be considerable. Another justification for considering 
program schemas is given by the fact that they have precisely the same level of 
abstraction as is usually assumed in program slicing. 
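
The contrast between the schema of Figure 3 and its non-deterministic counterpart can be checked by brute force. The following Python sketch is an illustration only and is not part of the paper's development: terms are represented as strings, the function names are hypothetical, and the non-deterministic loop is unrolled up to an assumed bound. It relies on the observation made above that the predicate term p(z) never changes, so a single interpretation gives it one fixed truth value.

```python
def values_of_y_in_schema():
    """Terms held by y at l in the schema of Fig. 3, over all interpretations.

    The loop guard is the fixed predicate term p(z), so an interpretation
    assigns it one truth value for the whole execution.
    """
    results = set()
    for p_of_z in (True, False):
        x = "f()"                  # x := f();
        if p_of_z:
            # p(z) stays true forever: the loop is entered but never left,
            # so the vertex l is never reached under this interpretation.
            continue
        results.add(x)             # l: y := x;
    return results


def values_of_y_nondeterministic(max_iterations=3):
    """Terms held by y at l when 'while p(z)' is replaced by 'loop'.

    max_iterations is an assumed bound used only to keep the search finite.
    """
    results = set()
    for iterations in range(max_iterations + 1):
        x = "f()"
        for _ in range(iterations):
            x = "g()"              # loop body: x := g();
        results.add(x)
    return results


def exists_dd(symbol, terms):
    """True if some reachable term contains the given function symbol."""
    return any(symbol + "(" in term for term in terms)
```

Under these assumptions, `exists_dd("g", values_of_y_in_schema())` is false while `exists_dd("g", values_of_y_nondeterministic())` is true, matching the discussion of S and T above.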

As an example of the use of establishing universal data dependence, consider a 
program which calculates the cost of a purchase - we would expect the overall price 
to depend always on the costs and amounts of the item(s) purchased. If this fails, 
then the program clearly contains a fault. 

The complexity results for existential dependence are more promising than they 
might initially appear. This is because many concurrent systems have only rela- 
tively few threads even if they are quite large (in terms of lines of code). The results 
also suggest that it should be easier to 'scale' data dependence algorithms to large 
programs/schemas with only a few threads than to smaller programs/schemas with 
many threads. For schemas and programs that might not be free, data dependence 
calculated on the assumption that freeness holds provides a conservative abstrac- 
tion of the actual data dependence. As a result, if existential data dependence does 
not hold under the freeness assumption then we know it does not hold even if the 
program or schema under consideration is not free. This is important in areas such 
as security where we wish to show that the value of one variable x, whose value is 
accessible, cannot depend on the value of another variable y whose value should be 
kept secret. 

2. BASIC DEFINITIONS FOR SCHEMAS 

Throughout this paper, $\mathcal{F}$, $\mathcal{P}$, $\mathcal{V}$ and $\mathcal{L}$ denote fixed infinite sets of function symbols,
predicate symbols, variables and labels respectively. We assume a function

$arity : \mathcal{F} \cup \mathcal{P} \to \mathbb{N}$.

The arity of a symbol x is the number of arguments referenced by x. Note that in 
the case when the arity of a function symbol g is zero, g may be thought of as a 
constant. 

Definition 2.1 schemas. We define the set of all schemas recursively as follows.
$l$ : skip is a schema. An assignment $l : y := f(\mathbf{x});$ where $y \in \mathcal{V}$, $f \in \mathcal{F}$, $l \in \mathcal{L}$ and
$\mathbf{x}$ is a vector of $arity(f)$ variables, is a schema. Similarly an equality assignment
$l : y := x;$ for $y, x \in \mathcal{V}$ is a schema. From these all schemas may be 'built up' from
the following constructs on schemas.

sequences. $S' = U_1 U_2 \ldots U_r$ is a schema provided that each $U_i$ for $i \in \{1, \ldots, r\}$
is a schema.

if schemas. $S'' = l :$ if $p(\mathbf{x})$ then $T_1$ else $T_2$ is a schema whenever $p \in \mathcal{P}$, $l \in \mathcal{L}$, $\mathbf{x}$
is a vector of $arity(p)$ variables, and $T_1, T_2$ are schemas.

non-deterministic branches. $S'' = l : T_1 \sqcup T_2 \ldots \sqcup T_m$ is a schema whenever $l \in \mathcal{L}$
and $T_1, \ldots, T_m$ are schemas.

while schemas. $S''' = l :$ while $q(\mathbf{y})\ T$ is a schema whenever $q \in \mathcal{P}$, $l \in \mathcal{L}$, $\mathbf{y}$ is a
vector of $arity(q)$ variables, and $T$ is a schema.

non-deterministic loops. $S''' = l :$ loop $T$ is a schema if $l \in \mathcal{L}$ and $T$ is a schema.

concurrent schemas. $S'''' = l : T_1 \| T_2 \ldots \| T_m$ is a schema, where $T_1, \ldots, T_m$ are
schemas.

We only consider schemas without repeated labels; for example, in the case of
the 'while' schema $l :$ while $q(\mathbf{y})\ T$, we assume that the label $l$ does not occur in the
recursive definition of $T$.

The semantics of schemas are defined by their flowcharts, which are finite directed 
graphs. A directed graph G is a pair (V, E) with $E \subseteq V \times V$. We define V =
Vertices(G), the set of vertices of G.

Definition 2.2. Given a schema S, we define a finite directed graph Flowchart(S)
with an edge labelling function $edgeType_S$ that associates to each edge of Flowchart(S)
either $\epsilon$, a triple $(p, \mathbf{x}, X)$ for a predicate $p$, a vector $\mathbf{x}$ of variables and $X \in \{T, F\}$,
or an assignment, as follows. Unless otherwise stated below, $edgeType_S$ maps to $\epsilon$.

(1) If S is $l$ : skip or $l : y := f(\mathbf{x});$ or $l : y := x;$ then Flowchart(S) has vertex set
{start, $l$, end} and edges (start, $l$) and ($l$, end). Here $edgeType_S(l, end)$ is $\epsilon$,
$y := f(\mathbf{x});$ or $y := x;$, respectively.

(2) If $S = S_1 S_2$, then Flowchart(S) has vertex set

$Vertices(Flowchart(S_1)) \cup Vertices(Flowchart(S_2))$

and contains every edge occurring in the flowchart of either $S_1$ or $S_2$, with the function $edgeType_S$
returning the same value as in $S_1$ or $S_2$ respectively, except that Flowchart(S)
does not have any edge ($l$, end) for a vertex $l$ in $S_1$ or (start, $l$) for a vertex
$l$ in $S_2$. Instead, it has an edge $(l_1, l_2)$ for each pair of edges $(l_1, end)$ and
$(start, l_2)$ in $Flowchart(S_1)$ and $Flowchart(S_2)$ respectively, with
$edgeType_S(l_1, l_2) = edgeType_{S_1}(l_1, end)$.

(3) If $S = l : S_1 \sqcup S_2 \ldots \sqcup S_m$, then Flowchart(S) has vertex set

$Vertices(Flowchart(S_1)) \cup \ldots \cup Vertices(Flowchart(S_m)) \cup \{l\}$

and contains all edges $(l', l'')$ lying in any $Flowchart(S_k)$ such that $l' \neq start$,
with the function $edgeType_S$ returning the same value as $edgeType_{S_k}$ in the
appropriate $Flowchart(S_k)$, and also contains an edge $(l, l'')$ for each edge
$(start, l'')$ in any $Flowchart(S_k)$. Additionally, Flowchart(S) contains the edge
(start, $l$).
(3') If $S = l :$ if $p(\mathbf{x})$ then $S_1$ else $S_2$, then Flowchart(S) is identical to
$Flowchart(l : S_1 \sqcup S_2)$ except that the edges $(l, l'')$ for each edge $(start, l'')$ in either
$Flowchart(S_1)$ or $Flowchart(S_2)$ are mapped by $edgeType_S$ to $(p, \mathbf{x}, T)$ or $(p, \mathbf{x}, F)$ respectively.

(4) If $S = l :$ while $q(\mathbf{y})\ T$, then Flowchart(S) has vertex set $Vertices(Flowchart(T)) \cup \{l\}$
and contains all edges $(l', l'')$ lying in Flowchart(T) such that $l' \neq start$ and
$l'' \neq end$, with the function $edgeType_S$ returning the same value as $edgeType_T$,
and also contains an edge $(l, l'')$ for each edge $(start, l'')$ in Flowchart(T),
with $edgeType_S(l, l'') = (q, \mathbf{y}, T)$, and an edge $(l'', l)$ for each edge $(l'', end)$
in Flowchart(T), with $edgeType_S(l'', l) = edgeType_T(l'', end)$. Additionally,
Flowchart(S) contains the edges (start, $l$) and ($l$, end), with $edgeType_S(l, end) = (q, \mathbf{y}, F)$.

(4') If $S = l :$ loop $T$, then Flowchart(S) is identical to Flowchart($l :$ while $q(\mathbf{y})\ T$),
except that edges with $l$ as initial vertex map to $\epsilon$ under $edgeType_S$.
(5) If $S = l : S_1 \| S_2 \| \ldots \| S_m$, then Flowchart(S) has vertex set

$\times_{i=1}^{m} Vertices(Flowchart(S_i)) \cup \{start, l, end\}$,

and given any $r \le m$ and vertices $l_i \in Vertices(Flowchart(S_i))$ for all $i \neq r$ and
any edge $(l', l'')$ in $Flowchart(S_r)$, the graph Flowchart(S) has an edge

$((l_1, \ldots, l_{r-1}, l', l_{r+1}, \ldots, l_m),\ (l_1, \ldots, l_{r-1}, l'', l_{r+1}, \ldots, l_m))$

whose image under $edgeType_S$ is equal to $edgeType_{S_r}(l', l'')$. Additionally,
Flowchart(S) contains the edges (start, $l$), ($l$, (start, ..., start)) and
((end, ..., end), end).

2.1 Semantics of schemas 

The symbols upon which schemas are built are given meaning by defining the 
notions of a state and of an interpretation. It will be assumed that variables take 
values in the set of terms built from the sets of variables and function symbols. 
This set, which we denote by $Term(\mathcal{F}, \mathcal{V})$, is usually called the Herbrand domain.
It is formally defined as follows:

— each variable is a term,

— if $f \in \mathcal{F}$ is of arity $n$ and $t_1, \ldots, t_n$ are terms then $f(t_1, \ldots, t_n)$ is a term.

The function symbols represent the 'natural' functions with respect to the set of
terms; that is, each function symbol $f$ defines the function $(t_1, \ldots, t_n) \mapsto f(t_1, \ldots, t_n)$
for all $n$-tuples of terms $(t_1, \ldots, t_n)$. A state is a function from $\mathcal{V}$ into the set
of terms. An interpretation $i$ defines, for each predicate symbol $p \in \mathcal{P}$ of
arity $m$, a function $p^i : Term(\mathcal{F}, \mathcal{V})^m \to \{T, F\}$. We define the natural state
$e : \mathcal{V} \to Term(\mathcal{F}, \mathcal{V})$ by $e(v) = v$ for all $v \in \mathcal{V}$.

Definition 2.3 state associated with a path through Flowchart(S) for schema S.
Given a state $d$, a schema S and a path $\nu$ through Flowchart(S) whose first element
is start, we define the state $\mathcal{M}[\![\nu]\!]_d$ recursively as follows.

— $\mathcal{M}[\![start]\!]_d(v) = d(v)$ for all variables $v$.

— If $\nu = \mu\, l\, l'$ for vertices $l, l'$ in Flowchart(S) and $edgeType_S(l, l')$ is not an
assignment, then $\mathcal{M}[\![\nu]\!]_d = \mathcal{M}[\![\mu\, l]\!]_d$.

— If $\nu = \mu\, l\, l'$ for $l, l' \in Labels(S)$ and $edgeType_S(l, l') = y := f(x_1, \ldots, x_n);$,
then $\mathcal{M}[\![\nu]\!]_d(z) = \mathcal{M}[\![\mu\, l]\!]_d(z)$ for all variables $z \neq y$, and

$\mathcal{M}[\![\nu]\!]_d(y) = f(\mathcal{M}[\![\mu\, l]\!]_d(x_1), \ldots, \mathcal{M}[\![\mu\, l]\!]_d(x_n))$,
and the case of equality assignments is treated analogously.
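
The state update of Definition 2.3 can be sketched in a few lines of Python. This is an illustrative sketch only, under the following assumptions that are not part of the paper: a variable is represented by a string, an application f(t_1, ..., t_n) by a tuple whose head is the function symbol, and the helper names `natural_state`, `assign`, `copy_var` and `contains` are hypothetical.

```python
def natural_state(variables):
    """The natural state e, which maps every variable to itself."""
    return {v: v for v in variables}


def assign(state, target, func, args=()):
    """State after the assignment target := func(args);

    The new term is built from the terms currently held by the argument
    variables; every other variable keeps its value.
    """
    new_state = dict(state)
    new_state[target] = (func,) + tuple(state[x] for x in args)
    return new_state


def copy_var(state, target, source):
    """State after the equality assignment target := source;"""
    new_state = dict(state)
    new_state[target] = state[source]
    return new_state


def contains(term, symbol):
    """True if the variable or function symbol occurs anywhere in the term."""
    if isinstance(term, str):          # a variable
        return term == symbol
    return term[0] == symbol or any(contains(arg, symbol) for arg in term[1:])


# The path through the schema of Fig. 1 that takes the 'then' branch.
e = natural_state(["u", "v"])
after_u = assign(e, "u", "h")            # u := h();
after_v = assign(after_u, "v", "f", ["u"])  # v := f(u);
```

Here `after_v["v"]` is the term f(h()), so it contains both f and h but no longer contains the variable u.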

Definition 2.4 executable paths and free schemas. Given a schema S and an
interpretation $i$ and a path $\nu$ through Flowchart(S) whose first element is start, we
say that $\nu$ is compatible with $i$ if given any prefix $\mu\, l\, l'$ of $\nu$ such that $edgeType_S(l, l') =
(p, x_1, \ldots, x_n, X)$, $p^i(\mathcal{M}[\![\mu\, l]\!]_e(x_1), \ldots, \mathcal{M}[\![\mu\, l]\!]_e(x_n)) = X$ holds. A path whose first
element is start is said to be executable if there exists an interpretation with which
it is compatible. A schema is said to be free if every path whose first element is
start is executable.
Since a schema S may contain the non-deterministic loop, $\sqcup$ and $\|$ constructions,
an initial state $d$ and an interpretation $i$ need not define a unique executable path
in Flowchart(S) from start to end. In the event that only one executable path
exists, we denote it by $\pi_S(i, d)$, and write $\mathcal{M}[\![S]\!]^i_d$ to denote the state $\mathcal{M}[\![\pi_S(i, d)]\!]_d$.
If S is merely a sequence of assignments, so that the interpretation $i$ is irrelevant,
then we simply write $\mathcal{M}[\![S]\!]_d$.

2.2 The data dependence problems 

We now formalise the two data dependence conditions with which we are concerned 
in this paper. 

Definition 2.5. Let S be a schema and let $v \in \mathcal{V}$, let $l \in Vertices(Flowchart(S))$
and let $f \in \mathcal{F} \cup \mathcal{V}$. The predicate $\exists DD_S(f, v, l)$ is defined to hold if there is an
executable path $\mu$ through Flowchart(S) which starts at start and ends at $l$ such
that the term $\mathcal{M}[\![\mu]\!]_e(v)$ contains $f$; and the predicate $\forall DD_S(f, v, l)$ is defined to
hold if for every executable path $\mu$ through Flowchart(S) that starts at start and
ends at $l$, the term $\mathcal{M}[\![\mu]\!]_e(v)$ contains $f$.
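
To make Definition 2.5 concrete, the following Python sketch decides both predicates at end for the schema of Figure 1 by enumerating interpretations. It is a hand-written illustration rather than a general algorithm: terms are strings, the function names are hypothetical, and it uses the fact that this particular schema evaluates only one predicate term, p(h()), so an interpretation is determined by a single truth value and both resulting paths are executable.

```python
def run_fig1(p_value):
    """Final state of the schema of Fig. 1 from the natural state.

    p_value is the truth value that the interpretation gives to the
    predicate term p(h()).
    """
    state = {"u": "u", "v": "v"}              # natural state e
    state["u"] = "h()"                        # u := h();
    if p_value:
        state["v"] = "f(" + state["u"] + ")"  # v := f(u);
    else:
        state["v"] = "g()"                    # v := g();
    return state


def exists_dd(symbol, variable):
    """Existential data dependence at end: some executable path suffices."""
    return any(symbol + "(" in run_fig1(p)[variable] for p in (True, False))


def forall_dd(symbol, variable):
    """Universal data dependence at end: every executable path must agree."""
    return all(symbol + "(" in run_fig1(p)[variable] for p in (True, False))
```

For this schema the sketch reports that existential dependence of v on f holds but universal dependence does not, since the path through the else branch leaves v holding g().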

3. COMPLEXITY RESULTS FOR SCHEMAS HAVING ONLY EQUALITY ASSIGNMENTS

In this section, we prove that even if we restrict ourselves to the class of schemas 
without concurrency constructs and having only equality assignments, both the 
existential and universal data dependence problems are PSPACE-hard, and become 
NP-hard and co-NP-hard respectively if schemas are also required to be loop-free. 
We also show that if we keep the restriction to equality assignments but allow 
concurrency constructs, and add the further assumption of a constant bound on 
the arity of any predicate symbol, both problems lie in PSPACE. 

3.1 Notational conventions 

— In the proof of Theorems 3.1 and 3.5, we will define schemas without indicating 
labels, and indicate paths simply by using sequences of predicates and end. 
These schemas do not have the concurrency || symbol and hence all vertices in 
the appropriate graph Flowchart (S) lie in Labels (S) U {start, end}. In the cases 
where this convention is used, paths in the sense of Definition 2.2 are defined 
unambiguously. 

— We will need to refer to finite sets of non-negative integers 'without gaps'. Thus
we define the set

$[m, n] = \{m, m + 1, \ldots, n\}$

for any $m \le n$.
— In order to save space, we will sometimes abbreviate schemas consisting of
sequences of equality assignments by using the quantifier $\forall$. For example, in Fig.
5, the line $\forall k \in [0, m_j]\ t_k := s_{j,k};$ is intended as a shorthand for the sequence

$t_0 := s_{j,0};\ t_1 := s_{j,1};\ \ldots\ t_{m_j} := s_{j,m_j};$

The lines $\forall j \in [1, m]\ \forall k \in [0, m_j]\ s_{j,k} := u_{bad};$ and $\forall s \in \bigcup_{j \in [1,m]} F_j\ s := u_{good};$ in
Fig. 6 have analogous meanings. We only use this notation in cases where the
$D_i = C_{i,y_{i,1}}\ C_{i,y_{i,2}}\ C_{i,y_{i,3}}$

where $C_{i,y}$ is the schema

    if $q_j()$ then $u_i := u_{i-1};$                 if $y = x_j$,
    if $q_j()$ then skip else $u_i := u_{i-1};$       if $y = \neg x_j$.

Fig. 4. The definition of the schema $D_i$ used in the proof of Theorem 3.1.

order of the assignments is immaterial, since no variable occurs on both the left 
side of one assignment in the sequence and the right side of another, and so the 
assignments commute. 
— In Lemma 3.4 and Theorem 3.5, we will define finite state automata for which the 
word 'state' has its usual meaning; however we will also define schemas having 
variables which are the states of the automata, and thus the word state has 
the distinct meaning of a function from variables (automata-theoretic states) to 
elements of the domain (variables, in the case of schemas having only equality 
assignments). This should not cause confusion. 

3.2 NP-hardness of data dependence problems for loop-free schemas without concurrency constructions

Our main NP-hardness result follows. 

Theorem 3.1. For a schema S, $v \in \mathcal{V}$ and $f \in \mathcal{V}$, the problem of deciding
$\exists DD_S(f, v, end)$ is NP-hard and that of deciding $\forall DD_S(f, v, end)$ is co-NP-hard,
even when (in the case of both problems) S is restricted to membership of the class
of schemas satisfying the following conditions.

— S has no concurrency or non-deterministic branching constructions and has only
equality assignments.

— S contains no loops.

— Each predicate in S has zero arity.

Proof. We consider $\exists DD_S$ first, and then indicate the proof for $\forall DD_S$. To
show NP-hardness of deciding whether $\exists DD_S(f, v, l)$ holds, we use a polynomial-time
reduction from 3SAT, which is known to be an NP-hard problem [Cook 1971].
An instance of 3SAT comprises a set $X = \{x_1, \ldots, x_n\}$ and a propositional formula
$\rho = \bigwedge_{k=1}^{n} (y_{k,1} \lor y_{k,2} \lor y_{k,3})$, where each $y_{i,j}$ is either $x_k$ or $\neg x_k$ for some $k \le n$.
The problem is satisfied if there exists a valuation $\delta : X \cup \neg X \to \{T, F\}$ such that
for each $x \in X$, $\{\delta(x), \delta(\neg x)\} = \{T, F\}$, under which $\rho$ evaluates to T. Given this
instance of 3SAT we will construct a schema S that satisfies the conditions given
in the statement of the Theorem and contains variables $u_{bad}, u_0, \ldots, u_n$ such that
$\exists DD_S(u_0, u_n, end)$ holds if and only if $\rho$ is satisfiable. The schema S is

$\forall j \in [1, n]\ u_j := u_{bad};\ D_1 \ldots D_n$

where $D_i$ is as defined in Figure 4. Clearly S can be constructed in polynomial
time from the given instance of 3SAT, as required.

Assume first that there exists a valuation $\delta : X \to \{T, F\}$ under which $\rho$ evaluates
to T. Define the interpretation $i$ to map $q_j()$ to $\delta(x_j)$ for each $q_j$. Then the path

ACM Journal Name, Vol. V. No. N, Month 20YY. 



Data Dependence problems for Program Schemas • 9 

$\pi_S(i, e)$ clearly passes through at least one assignment $u_i := u_{i-1};$ within each $D_i$ in
S, proving $\exists DD_S(u_0, u_n, end)$ holds. Conversely, if $\exists DD_S(u_0, u_n, end)$ holds, then
there is an interpretation $i$ such that the path $\pi_S(i, e)$ passes through the sequence of
assignments $u_1 := u_0;, \ldots, u_n := u_{n-1};$ in turn, and hence passes through $u_i := u_{i-1};$
at least once within each $D_i$. Define the valuation $\delta$ as follows: $\delta(x_j) = T$ if and only
if $i$ maps $q_j()$ to T. Clearly $\rho$ evaluates to T. Thus we have proved the Theorem
for $\exists DD_S$.

To prove co-NP-hardness of deciding the $\forall DD_S$ relation under the restricted
conditions given, observe that the final value of the variable $u_n$ always lies in
$\{u_0, u_{bad}\}$ and so $\exists DD_S(u_0, u_n, end) \iff \neg \forall DD_S(u_{bad}, u_n, end)$ holds. Thus
deciding $\forall DD_S(f, v, end)$ is co-NP-hard. □
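
The reduction in the proof of Theorem 3.1 can be exercised on small instances. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: literals are encoded as non-zero integers in the DIMACS style (j for $x_j$, -j for $\neg x_j$), there is one variable $u_i$ per clause, an interpretation is a dictionary giving the truth value of each zero-arity predicate $q_j()$, and all function names are hypothetical.

```python
from itertools import product


def run_schema(clauses, q):
    """Final value of the last u variable when S runs under interpretation q.

    clauses: list of 3-tuples of non-zero ints (j means x_j, -j means not x_j).
    q: dict mapping each variable index j to the truth value of q_j().
    """
    count = len(clauses)
    u = ["u_0"] + ["u_bad"] * count        # u_0 keeps its initial value
    for i, clause in enumerate(clauses, start=1):
        for literal in clause:             # D_i is three schemas C_{i,y}
            j, positive = abs(literal), literal > 0
            if q[j] == positive:           # the branch containing u_i := u_{i-1};
                u[i] = u[i - 1]
    return u[count]


def interpretations(num_vars):
    """All truth assignments to q_1(), ..., q_n()."""
    for bits in product([False, True], repeat=num_vars):
        yield dict(zip(range(1, num_vars + 1), bits))


def exists_dd(clauses, num_vars):
    """Existential dependence of the last u variable on u_0 at end."""
    return any(run_schema(clauses, q) == "u_0" for q in interpretations(num_vars))


def satisfiable(clauses, num_vars):
    """Direct brute-force satisfiability check, used only for comparison."""
    return any(
        all(any(q[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses)
        for q in interpretations(num_vars)
    )
```

On any instance small enough to enumerate, `exists_dd` and `satisfiable` should agree, which is the equivalence the proof establishes. Both functions take exponential time; the sketch checks the correctness of the reduction, not its complexity.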

3.3 PSPACE-hardness result for data dependence problems for schemas without concurrency constructions

The main theorem of this subsection, Theorem 3.5, uses a polynomial-time reduc- 
tion from the following automata-theoretic problem. 

Definition 3.2. Consider a set of deterministic finite state automata $A_1, \ldots, A_m$
for some $m > 0$, all using an alphabet $\Sigma$. The finite state automata intersection
problem is that of deciding whether there exists a word in $\Sigma^*$ that is accepted by
every automaton $A_j$.

Theorem 3.3 [Kozen 1977]. The finite state automata intersection problem is 
PSPACE-complete. 

Given a deterministic finite state automaton $A$ and a member $\sigma$ of its alphabet,
we wish to construct a schema consisting only of a sequence of assignments
whose variables are the states of $A$ and such that for any transition $s \xrightarrow{\sigma} s'$ in $A$,
$\forall DD_S(s', s, end)$ holds. The schema

$\forall k \in [0, a]\ t_k := s_k;$
$\forall k \in [0, a]\ s_k := t_{\chi(k)};$

satisfies this requirement if $A$ has state set $\{s_0, \ldots, s_a\}$ and its $\sigma$-transitions all
have the form $s_k \xrightarrow{\sigma} s_{\chi(k)}$ for a function $\chi : [0, a] \to [0, a]$, with new variables $t_k$
disjoint from the variables $s_l$. It may be worth mentioning that the simpler schema

$s_0 := s_{\chi(0)};\ \ldots\ s_a := s_{\chi(a)};$

does not satisfy the required data dependence condition because the assignments
may 'interfere' with one another; for example, if $A$ has only two states $s_0, s_1$ and
has transitions $s_1 \xrightarrow{\sigma} s_0$ and $s_0 \xrightarrow{\sigma} s_1$, then if $S$ is the schema

$s_0 := s_1;$
$s_1 := s_0;$

then $\forall DD_S(s_1, s_1, end)$ rather than the required $\forall DD_S(s_0, s_1, end)$ holds. Thus it
is necessary to introduce the 'copying' variables $t_k$.
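
The interference described above is easy to reproduce. The following Python sketch is illustrative only: a state of the schema is modelled as a list whose entry k is the value held by $s_k$, the function $\chi$ is given as a list, and the names `with_copies` and `naive` are hypothetical.

```python
def with_copies(chi, values):
    """Run  t_k := s_k;  for all k, then  s_k := t_chi(k);  for all k."""
    t = list(values)                                   # the copying variables t_k
    return [t[chi[k]] for k in range(len(values))]


def naive(chi, values):
    """Run  s_k := s_chi(k);  for k = 0, 1, ... in order, without copies."""
    s = list(values)
    for k in range(len(s)):
        s[k] = s[chi[k]]                               # may read an overwritten s
    return s


# Two states with sigma-transitions s_0 -> s_1 and s_1 -> s_0.
swap = [1, 0]
initial = ["s_0", "s_1"]
```

With these definitions `with_copies(swap, initial)` yields `["s_1", "s_0"]`, so $s_1$ ends up holding the initial value of $s_0$ as required, whereas `naive(swap, initial)` yields `["s_1", "s_1"]`.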
$\forall k \in [0, m_j]\ t_k := s_{j,k};$
$\forall k \in [0, m_j]\ s_{j,k} := t_{\chi_j(i,k)};$

Fig. 5. The schema $U_{j,i}$ of Lemma 3.4. Here the $t_k$ are new variables used solely for copying and
the function $\chi_j$ is defined by the state transition function $\eta_j$ of the automaton $A_j$ as follows: for
any letter $\alpha_i$ and state $s_{j,k}$, $\eta_j(\alpha_i, s_{j,k}) = s_{j,\chi_j(i,k)}$. Observe that the value defined by a variable
$s_{j,k}$ after execution of $U_{j,i}$ is the same as that defined by the variable $\eta_j(\alpha_i, s_{j,k})$ before execution,
since $\forall DD_{U_{j,i}}(\eta_j(\alpha_i, s_{j,k}), s_{j,k}, end)$ holds.



The motivation for constructing a schema in this way from a given finite state 
automaton is shown by Lemma 3.4. 

Lemma 3.4. Consider a set of $m$ deterministic finite state automata $A_1, \ldots, A_m$
for some $m > 0$, all using an alphabet $\Sigma = \{\alpha_1, \ldots, \alpha_n\}$, with each automaton $A_j$
having state set $S_j = \{s_{j,0}, \ldots, s_{j,m_j}\}$ and total transition function $\eta_j : \Sigma \times S_j \to
S_j$. For each automaton $A_j$ and each letter $\alpha_i \in \Sigma$, let $U_{j,i}$ be the predicate-free
schema in Fig. 5 and define $V_i = U_{1,i} \ldots U_{m,i}$. Let $l_1, l_2, \ldots, l_r \in [1, n]$ and define
$\gamma = \alpha_{l_r} \alpha_{l_{r-1}} \ldots \alpha_{l_1} \in \Sigma^*$.

(1) For every $j \in [1, m]$ and any $s \in S_j$, $\forall DD_{V_{l_1} \ldots V_{l_r}}(\eta_j(\gamma, s), s, end)$ holds.

(2) Assume each automaton $A_j$ has initial state $s_{j,0}$ and final state set $F_j \subseteq S_j$.
Let $e_{final}$ be the state (in the program sense)

$s_{j,k} \mapsto u_{bad}$ if $s_{j,k} \in S_j - F_j$,
$s_{j,k} \mapsto u_{good}$ if $s_{j,k} \in F_j$,

for new variables $u_{bad}, u_{good}$. Then

$\mathcal{M}[\![V_{l_1} \ldots V_{l_r}]\!]_{e_{final}}(s_{j,0}) = u_{good}$
for all $j$ if and only if the word $\gamma$ is accepted by every automaton $A_j$.

Proof. (1) can be straightforwardly proved by induction on $r$. (2) follows
immediately from (1) using the fact that for any $j$, $A_j$ accepts $\gamma$ if and only if
$\eta_j(\gamma, s_{j,0}) \in F_j$ holds. □
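
Part (2) of Lemma 3.4 can be tested by direct simulation. The sketch below is an illustration under assumptions that are not in the paper: each automaton is a hypothetical triple (number of states, transition dictionary keyed by (letter, state), set of final states), states and letters are integers, and state 0 is the initial state. The function `run_V` applies $V_{l_1} \ldots V_{l_r}$ to $e_{final}$, and `accepted_by_all` runs the automata directly on a word.

```python
def run_V(automata, letters):
    """Apply V_{l_1} ... V_{l_r} to e_final, where letters = [l_1, ..., l_r].

    Returns True if every initial-state variable s_{j,0} ends up holding u_good.
    """
    # e_final: final states hold u_good, all other states hold u_bad.
    values = [
        ["u_good" if k in finals else "u_bad" for k in range(size)]
        for size, eta, finals in automata
    ]
    for letter in letters:
        # U_{j,letter}: s_{j,k} receives the old value of eta_j(letter, s_{j,k}).
        values = [
            [old[eta[(letter, k)]] for k in range(size)]
            for old, (size, eta, finals) in zip(values, automata)
        ]
    return all(vals[0] == "u_good" for vals in values)


def accepted_by_all(automata, word):
    """True if every automaton accepts the word, read from left to right."""
    for size, eta, finals in automata:
        state = 0
        for letter in word:
            state = eta[(letter, state)]
        if state not in finals:
            return False
    return True


# Example automata over the letters {0, 1} (chosen for illustration).
even_zeros = (2, {(a, k): (1 - k if a == 0 else k) for a in (0, 1) for k in (0, 1)}, {0})
ends_in_one = (2, {(a, k): a for a in (0, 1) for k in (0, 1)}, {1})
```

For any sequence `letters`, the lemma predicts that `run_V(automata, letters)` equals `accepted_by_all(automata, list(reversed(letters)))`, the reversal reflecting the definition $\gamma = \alpha_{l_r} \ldots \alpha_{l_1}$.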

We now give the main PSPACE-hardness theorem of the paper, Theorem 3.5. 
The proof of this Theorem will construct a schema in which solving an existential 
data dependence problem corresponds to solving a given instance of the finite state 
automata intersection problem. Parts of the schema constructed will 'simulate' 
state transitions of the automata. 

Theorem 3.5. For a schema S, $v \in \mathcal{V}$ and $f \in \mathcal{V}$, the problems of deciding
whether $\exists DD_S(f, v, end)$ and $\forall DD_S(f, v, end)$ hold are both PSPACE-hard, even
when S is restricted to membership of the class of schemas satisfying the following
conditions.

— S has no concurrency or non-deterministic branching constructions and has only
equality assignments.
— No predicate occurs more than once in S.
— S contains two while predicates, one of which lies in the body of the other.

Proof. We consider $\exists DD_S$ first, and then indicate the proof for $\forall DD_S$. We
prove the Theorem using a reduction from the intersection problem for finite state
automata, given in Definition 3.2, which is PSPACE-complete by Theorem 3.3.
Thus we assume an instance of this problem comprising a set of $m$ deterministic
finite state automata $A_1, \ldots, A_m$ for some $m > 0$, all using an alphabet $\Sigma =
\{\alpha_1, \ldots, \alpha_n\}$, with each $A_j$ having state set $S_j = \{s_{j,0}, \ldots, s_{j,m_j}\}$, total transition
function $\eta_j : \Sigma \times S_j \to S_j$, initial state $s_{j,0}$ and final state set $F_j \subseteq S_j$, as in the
statement of Lemma 3.4. The problem is satisfied if there is a word in $\Sigma^*$ which is
accepted by every automaton $A_j$.

Given these automata, consider the schema S given in Fig. 6. Clearly S satisfies
the conditions listed in the statement of the Theorem and S can be constructed
in polynomial time from the set of automata $A_j$ as input. We now show that
$\exists DD_S(u_{good}, a_m, end)$ holds if and only if the intersection of the acceptance sets of
all the automata $A_j$ is non-empty, thus proving the Theorem.

$\forall j \in [1, m]\ a_j := u_{bad};$
$\forall j \in [1, m]\ b_j := u_{bad};$
while $Q_1(a_m)$ {
    $\forall j \in [1, m]\ a_j := u_{bad};$
    $\forall j \in [1, m-1]\ b_j := u_{bad};$
    $c := u_{bad};$
    $\forall j \in [1, m]\ \forall k \in [0, m_j]\ s_{j,k} := u_{bad};$
    if $Q_2(b_m)$ then $c := u_{good};$
    else
    {
        $\forall s \in \bigcup_{j \in [1,m]} F_j\ s := u_{good};$
        while $Q_3(s_{1,0}, \ldots, s_{m,0})$ $T_n$
    }
    if $p_1(s_{1,0})$ then $a_1 := b_m;$
                    else $b_1 := c;$
    if $p_2(s_{2,0})$ then $a_2 := a_1;$
                    else $b_2 := b_1;$
    ...
    if $p_m(s_{m,0})$ then $a_m := a_{m-1};$
                    else $b_m := b_{m-1};$
}

Fig. 6. The schema S used in the proof of Theorem 3.5. The schema $T_n$ is defined in Fig. 7.




— ($\Leftarrow$). Assume first that there is a word $\gamma = \alpha_{d_z} \alpha_{d_{z-1}} \ldots \alpha_{d_1}$ that is accepted by
every automaton $A_j$, for minimal $z$. We will prove that $\exists DD_S(u_{good}, a_m, end)$
holds. Define the interpretation $i$ on the predicates $Q_1, Q_2, Q_3$ and each $p_j$ as
$T_i$ = if $q_i(\mathbf{s})$ then $V_i$
        else $T_{i-1}$

$T_1 = V_1$

where $V_i = U_{1,i} \ldots U_{m,i}$

Fig. 7. The recursive definition of the schema $T_i$. Here $\mathbf{s}$ is a vector whose entries are all the
variables $s_{j,k}$, in any fixed order, and $U_{j,i}$ is the schema in Fig. 5. Observe that an execution of
$T_n$ entails an execution of one schema $V_l$ for some $l \in [1, n]$.



follows.

$Q_1(u_{bad}) \mapsto T$, $Q_1(u_{good}) \mapsto F$
$Q_2(u_{bad}) \mapsto T$, $Q_2(u_{good}) \mapsto F$
$Q_3(v_1, \ldots, v_m) \mapsto F$ iff every $v_j = u_{good}$
$p_j(u_{bad}) \mapsto F$, $p_j(u_{good}) \mapsto T$

We now indicate how $i$ is defined on the predicates $q_i$. Define the path

$\mu = Q_3\, q_n q_{n-1} \ldots q_{d_1}\, Q_3\, q_n q_{n-1} \ldots q_{d_2}\, Q_3 \ldots Q_3\, q_n q_{n-1} \ldots q_{d_z}\, Q_3$

$\in \Pi($while $(Q_3(s_{1,0}, \ldots, s_{m,0}))\ T_n)$.

We wish $\pi_S(i, e)$ to follow the path $\mu p_1$ whenever it encounters
while $(Q_3(s_{1,0}, \ldots, s_{m,0}))\ T_n$, in effect executing the schema $V_{d_1} \ldots V_{d_z}$. We now
show that this is possible. First observe that by Part (2) of Lemma 3.4 applied to
the suffixes of $\gamma$, every variable $s_{j,0}$ defines the value $u_{good}$ at the last occurrence
of $Q_3$ along $\mu$, but this does not hold at any earlier occurrence of $Q_3$, since
this would imply that a strict suffix of $\gamma$ was accepted by every automaton $A_j$,
contradicting the minimality of $z$. Thus the definition of $i$ on $Q_3$ given above
ensures that $\pi_S(i, e)$ follows the path $\mu p_1$ where required, provided that $i$ can be
defined appropriately on each predicate $q_i$.

Suppose that this is impossible; that is, that there is a repeated $q_i$-predicate term
along $\mu$ for some $q_i$, which $i$ would have to map to both T and F. Thus we can
write $\mu = \mu' q_i \mu'' q_i \mu'''$ such that every variable $s_{j,k}$ defines the same value at
the two occurrences of $q_i$. Assume that $Q_3$ occurs $z'$ times in $\mu'$ and $z''$ times
in $\mu''$; clearly $z'' \ge 1$. Since no variable apart from the variables $s_{j,k}$ occurs in
the while schema guarded by $Q_3$, every variable $s_{j,0}$ defines the same value after
the path $\mu' q_i \mu'''$ as after $\mu$, namely $u_{good}$. Thus by Part (2) of Lemma 3.4, the
word $\alpha_{d_z} \alpha_{d_{z-1}} \ldots \alpha_{d_{z'+z''}} \alpha_{d_{z'-1}} \ldots \alpha_{d_1}$ is accepted by every automaton $A_j$,
contradicting the minimality of $z$.

Thus we have shown that the interpretation $i$ can be defined so that $\pi_S(i, e)$
always follows the path $\mu$ whenever while $(Q_3(s_{1,0}, \ldots, s_{m,0}))\ T_n$ is reached, and
furthermore, every variable $s_{j,0}$ defines the value $u_{good}$ at the end of $\mu$, and so $p_1$
is the next symbol through which $\pi_S(i, e)$ passes.

We now prove that $\mathcal{M}[\![S]\!]^i_e(a_m) = u_{good}$ holds. The definition of $i$ on $Q_1$ ensures
that $\pi_S(i, e)$ passes at least once through the body of $Q_1$, and since $i$ maps $Q_2(b_m)$
to T and each $p_j(u_{bad})$ to F, on the first passing of $\pi_S(i, e)$ through the body of
$Q_1$, the assignment $c := u_{good};$ and all assignments to every $b_j$ occur, and hence
$b_m$ defines the value $u_{good}$ when $Q_1$ is reached for the second time along $\pi_S(i, e)$.
Since $i$ maps $Q_1(u_{bad})$ to T, the path $\pi_S(i, e)$ then enters the body of $Q_1$ a second
time, and since $i$ maps $Q_2(u_{good})$ to F, this time $\pi_S(i, e)$ passes through $Q_3$.
As proved above, $\pi_S(i, e)$ terminates within while $(Q_3(s_{1,0}, \ldots, s_{m,0}))\ T_n$ and
every $s_{j,0}$ defines $u_{good}$ when $\pi_S(i, e)$ then reaches $p_1$, and so $\pi_S(i, e)$ then passes
through all the assignments $a_1 := b_m;$ and $a_j := a_{j-1};$, after which $a_m$ defines the
value $u_{good}$. Since $i$ then maps $Q_1(a_m)$ to F, $\exists DD_S(u_{good}, a_m, end)$ holds, as required.
— ($\Rightarrow$). Conversely, suppose that $\exists DD_S(u_{good}, a_m, end)$ holds. Thus $\mathcal{M}[\![S]\!]^i_e(a_m) = u_{good}$
holds for some interpretation $i$. The only sequence of assignments which
could copy $u_{good}$ at the start of $S$ to $a_m$ at the end consists, in order, of the
assignment $c := u_{good};$ and those referencing every $b_j$ for $j < m$ followed by those
referencing $b_m$ and every $a_j$ for $j < m$, and so $\pi_S(i, e)$ must pass through all of
these in turn. Furthermore, owing to the assignments setting $c$ and $b_1, \ldots, b_{m-1}$
to $u_{bad}$, the assignments referencing $c$ and every $b_j$ for $j < m$ must occur in
a single passing through the body of $Q_1$, during which every $s_{j,0}$ defines $u_{bad}$
when $p_j$ is reached. Thus $i$ must map every $P_j(u_{bad})$ to F. Similarly, owing to
the assignments $a_j := u_{bad};$ the assignments referencing every $a_j$ for $j < m$ must
also occur in a single passing through the body of $Q_1$, and so the predicate term
defined by each $P_j(s_{j,0})$ must map to T, and so every $s_{j,0}$ must define a value
distinct from $u_{bad}$ simultaneously. The only possibility is $u_{good}$, and so at some
point the path $\pi_S(i, e)$ must reach $p_1$ with each $s_{j,0}$ defining $u_{good}$, and thus
must have passed through $Q_3$ since the last occurrence of $Q_2$. Let $V_{d_1} \cdots V_{d_z}$ be
the sequence of schemas $V_d$ occurring on $\pi_S(i, e)$ since this occurrence; then by
Part (2) of Lemma 3.4, the word $\alpha_{d_z} \alpha_{d_{z-1}} \cdots \alpha_{d_1}$ is accepted by every automaton
$A_j$, as required.

To prove PSPACE-hardness of deciding the $\forall DD_S$ relation, observe that the final
value of the variable $a_m$ always lies in $\{u_{good}, u_{bad}\}$ and so
$\exists DD_S(u_{good}, a_m, end) \iff \neg \forall DD_S(u_{bad}, a_m, end)$ holds. Thus deciding
$\forall DD_S(f, v, end)$ is co-PSPACE-hard and hence PSPACE-hard. □

3.4 Membership in PSPACE of data dependence problems for the class of schemas 
having a bound on the arity of all predicates and having only equality assignments, 
but without restrictions on concurrency constructs 

In order to prove that our problems lie in PSPACE, we need to show that the 
successors of a vertex in Flowchart (S) can be enumerated in polynomial time. This 
motivates Theorem 3.6. 

Theorem 3.6. Let S be a schema. 

(1) The vertices of Flowchart(S) can be encoded as words in the alphabet $Labels(S) \cup \{start, end\}$
in which no element of Labels(S) occurs more than once and start
and end each occur not more than $|Labels(S)|$ times.

(2) Given any $l' \in Vertices(Flowchart(S))$, the set of all $l'' \in Vertices(Flowchart(S))$
for which $(l', l'')$ is an edge in Flowchart(S), and the corresponding values of
$edgeType(l', l'')$, can be computed in polynomial time.

Proof. (1) We indicate the encoding by assuming that $S$ has the form
$S = l : S_1 \| S_2 \| \cdots \| S_m$; the encoding in the case of the other constructions given in
Definition 2.2 is straightforward to infer. In the concurrent case, Flowchart(S)
has vertex set $\times_{i=1}^{m} Vertices(Flowchart(S_i)) \cup \{start, l, end\}$ and a vertex of
Flowchart(S) can be encoded either by an element of $\{start, l, end\}$ (representing
themselves) or by a word $w = w_1 \ldots w_m$, where each $w_i$ represents an
element $l_i \in Vertices(Flowchart(S_i))$ and $w$ represents $(l_1, \ldots, l_m)$. The conditions
given on the frequency of letters in $w$ follow easily from those for each $w_i$ and
the fact we assume that no label occurs more than once in $S$.
(2) This follows easily by induction on the structure of $S$, using the encoding given
in Part (1) of this Theorem. □
Our other main theorem of this Section follows. 

Theorem 3.7. Let $S$ be a schema and let $v \in \mathcal{V}$, let $l$ be a vertex of Flowchart(S)
and let $f \in \mathcal{V}$. Assume that all assignments in $S$ are equality assignments. Assume
that there is a constant upper bound on the arity of any predicate symbol occurring
in $S$. Then the problems of deciding whether $\exists DD_S(f, v, l)$ or $\forall DD_S(f, v, l)$ hold
both lie in PSPACE.

Proof. We first prove decidability of $\exists DD_S(f, v, l)$ in PSPACE. We do this by
constructing the following algorithm, which lies in NPSPACE. We non-deterministically
guess a path beginning at start through the schema $S$ that realises the copying
of the initial value of the variable $f$ onto $v$ at the vertex $l$. At each point in the
algorithm we store not just the vertex and the state (with the domain restricted to
the set of variables referenced in $S$) reached, but also a finite, initially empty set
of equations of the form $p(\mathbf{y}) = X$ for a predicate $p$ occurring in $S$, a variable vector $\mathbf{y}$
whose components are referenced in $S$, and $X \in \{T, F\}$. If $n$ is an upper bound on
the total number of predicates and variables occurring in $S$ and $b$ is the assumed
constant upper bound on the arity of any predicate in the class of schemas under
consideration, then the number of equations of this form is bounded by $2n^{b+1}$ and
thus the data stored at any point in the execution of the algorithm is polynomially
bounded.

Whenever the algorithm crosses an edge $(l', l'')$ in Flowchart(S) satisfying
$edgeType_S(l', l'') = (q, \mathbf{x}, X)$, the equation $q(\mathbf{y}) = X$ is added to the set, where the
vector $\mathbf{y} = \mathcal{M}[\![\mu]\!]_e \mathbf{x}$, with $\mu$ being the path traced by the algorithm up to the vertex
$l'$. No equation is added to the set when an edge for which $edgeType_S$ returns $\epsilon$ or
an assignment is crossed. Thus this equation set encodes the set of interpretations
which are compatible with the path followed, in the sense that an interpretation $i$
is compatible with this path if and only if $p(\mathbf{y}) = X$ is a consequence of $i$ for all
equations $p(\mathbf{y}) = X$ in the set.

The algorithm terminates and returns false if the equation set acquires a pair
of contradictory equations (that is, a pair $p(\mathbf{w}) = T$, $p(\mathbf{w}) = F$) at any point. It
terminates and returns true if $l$ is reached with the state mapping $v$ to $f$ without
two contradictory equations having occurred in the set. By Theorem 3.6, this
algorithm lies in NPSPACE. Since PSPACE = NPSPACE holds, the problem of
deciding $\exists DD_S(f, v, l)$ is thus in PSPACE.

To prove decidability of $\forall DD_S(f, v, l)$ in co-NPSPACE = PSPACE instead, we
modify the algorithm as follows: termination with output true occurs if $l$ is reached
with the state not mapping $v$ to $f$. □
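The search described in this proof can be illustrated by a small deterministic simulation. The sketch below is not the NPSPACE algorithm itself: it replaces the non-deterministic guess by a breadth-first search over configurations (vertex, state, equation set) and keeps a visited set, so it may use exponential space. The flowchart encoding (a dictionary of successor lists with edge types `"eps"`, `"assign"` and `"pred"`), the function name `exists_dd` and the two example flowcharts are assumptions made for illustration only.

```python
from collections import deque

def exists_dd(flowchart, variables, f, v, target):
    """Search for a path to `target` on which v holds the initial value of f."""
    initial = tuple((x, x) for x in sorted(variables))   # each variable holds its own initial value
    start = ("start", initial, frozenset())
    seen, queue = {start}, deque([start])
    while queue:
        vertex, state_items, equations = queue.popleft()
        state = dict(state_items)
        if vertex == target and state[v] == f:
            return True
        for successor, edge in flowchart.get(vertex, []):
            new_state, new_equations = state, equations
            if edge[0] == "assign":                      # equality assignment lhs := rhs
                _, lhs, rhs = edge
                new_state = {**state, lhs: state[rhs]}
            elif edge[0] == "pred":                      # branch on q(args) taking value `truth`
                _, q, args, truth = edge
                term = (q, tuple(state[a] for a in args))
                if (term, not truth) in equations:
                    continue                             # contradictory pair: abandon this path
                new_equations = equations | {(term, truth)}
            config = (successor, tuple(sorted(new_state.items())), new_equations)
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False

VARS = ["u", "v", "w", "x"]
# Hypothetical flowchart for: u := w; if p(u) then v := u else v := x
BRANCH = {
    "start": [("a", ("eps",))],
    "a": [("b", ("assign", "u", "w"))],
    "b": [("c", ("pred", "p", ("u",), True)), ("d", ("pred", "p", ("u",), False))],
    "c": [("end", ("assign", "v", "u"))],
    "d": [("end", ("assign", "v", "x"))],
}
# Hypothetical flowchart for: if p(u) then (if p(u) then skip else v := w)
NESTED = {
    "start": [("a", ("pred", "p", ("u",), True)), ("end", ("pred", "p", ("u",), False))],
    "a": [("end", ("pred", "p", ("u",), True)), ("b", ("pred", "p", ("u",), False))],
    "b": [("end", ("assign", "v", "w"))],
}
```

In `NESTED` the assignment `v := w` is only reachable along a path that requires `p(u)` to be both T and F, so the equation set rules it out, whereas in `BRANCH` both copies into `v` are realisable.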
4. COMPLEXITY RESULTS FOR FREE SCHEMAS

If we allow assignments with function symbols, and not just equalities, to occur in
schemas, then deciding data dependence becomes harder, and the proof of membership
in PSPACE for both problems in Theorem 3.7 does not appear to generalise.
However, under restriction to the class of free schemas, we prove in Theorem 4.5
that deciding existential data dependence is PSPACE-complete, using Müller-Olm's
result [Muller-Olm and Seidl 2001] for non-deterministic programs. Additionally,
we prove in Theorem 4.11 that under the further condition that a constant bound
is placed on the number of subschemas occurring in parallel, this problem becomes
polynomial-time decidable.

Recall that a schema is free if every path through its flowchart is executable. As
an example, the schema

while q(z) do z := h(z);

(we have omitted labels from its definition) is free, whereas while q(z) do z := g();
is not free, since there is no interpretation and initial state such that the path so
defined enters the body of q exactly twice: this would require the predicate term
q(g()) to be mapped to T at the second test and to F at the third.
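The difference between the two schemas can be seen by listing the predicate terms that successive tests of the loop guard evaluate. The following is a minimal sketch, assuming terms are represented as strings and that the loop body is a single assignment to `z`; the helper names `predicate_terms` and `has_repeated_term` are hypothetical. A repeated term means that two tests are forced to take the same branch, so some path through the flowchart is not executable.

```python
def predicate_terms(rhs, tests):
    """Terms q(...) evaluated at successive tests of `while q(z) do z := rhs`."""
    value, terms = "z", []
    for _ in range(tests):
        terms.append(f"q({value})")
        value = rhs.replace("z", value)   # effect of one pass through the body
    return terms

def has_repeated_term(rhs, tests):
    """True if two of the first `tests` guard evaluations use the same term."""
    terms = predicate_terms(rhs, tests)
    return len(set(terms)) < len(terms)

print(predicate_terms("h(z)", 3))   # terms grow at every iteration
print(predicate_terms("g()", 3))    # the second and third tests coincide
```

With body `z := h(z)` every test sees a new term, whereas with body `z := g()` the term `q(g())` recurs from the third test onward.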

4.1 PSPACE-completeness of the existential data dependence problem for free schemas 

Theorem 4.5 is the main result of this subsection. 

Lemma 4.1. Given any schema $S$ without predicates, a variable $v$ and $f \in \mathcal{V} \cup \mathcal{F}$,
the problem of deciding whether $\exists DD_S(f, v, end)$ holds is PSPACE-hard.

Proof. This is [Muller-Olm and Seidl 2001, Theorem 2]. □ 

Lemma 4.2. Given any free schema $S$, a vertex $l$ in Flowchart(S), a variable $v$
and $f \in \mathcal{V} \cup \mathcal{F}$, with $l$, $v$ and $f$ all occurring in $S$, there exists a free schema $S'$ which
does not contain any loop or $\sqcup$ constructions, and such that $\exists DD_S(f, v, l)$ holds if
and only if $\exists DD_{S'}(f, v, l)$ does. Furthermore, $S'$ can be constructed in polynomial
time from $S$.

Proof. Given $S$, we replace loop or $\sqcup$ constructions with while and nested if
statements respectively, in the following way. Let $z$ be a variable not occurring in $S$
and not equal to $v$ or $f$, let $h$ be any function symbol and let $q$ be any predicate
symbol. Suppose that $m$ : loop $T$ occurs in $S$; then we replace it by
$m' : z := h(z);\ m :$ while $q(z)$ do $\{m'' : z := h(z);\ T\}$, for new labels $m'$, $m''$. Similarly, an occurrence of
$m : T_n \sqcup \ldots \sqcup T_1$ in $S$ can be replaced by the schema $m : P_n$, where we recursively
define $P_1 = z := h(z); T_1$ and $P_r =$ if $q(z)$ then $z := h(z); T_r$ else $z := h(z); P_{r-1}$ for
$r > 1$, where we have omitted labels in the definitions of each $P_r$. Let $S'$ be the
schema obtained from $S$ after all the loop or $\sqcup$ constructions have been replaced.
Since $z$ is never referenced in the original schema $S$, the new assignments to $z$ cannot
interfere with the existing data dependence relations in $S$, and the length of
any term defined by $z$ along a path through $S'$ must successively increase at each
assignment to $z$, hence the introduction of the new while and if statements cannot
cause repeated predicate terms to occur. Thus $S'$ is free if $S$ is. There is a natural
correspondence between paths in $S$ and in $S'$, and thus $\exists DD_S(f, v, l)$ holds if and
only if $\exists DD_{S'}(f, v, l)$ does. Also, $S'$ can be constructed in polynomial time from
$S$, proving the Lemma. □
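The replacement used in this proof can be sketched on a toy syntax tree. The tuple encoding of schemas below, the omission of labels, and the names `bump`, `eliminate` and `constructs` are assumptions for illustration; the fresh symbols `z`, `h` and `q` are those of the proof and are assumed not to clash with the input schema.

```python
Z, H, Q = "z", "h", "q"   # fresh variable, function symbol and predicate symbol (assumed unused in S)

def bump():
    """The assignment z := h(z)."""
    return ("assign", Z, H, (Z,))

def eliminate(schema):
    """Replace every loop and choice construction by while and nested if schemas."""
    kind = schema[0]
    if kind == "assign":
        return schema
    if kind in ("seq", "par"):
        return (kind, [eliminate(s) for s in schema[1]])
    if kind == "while":
        _, pred, args, body = schema
        return ("while", pred, args, eliminate(body))
    if kind == "if":
        _, pred, args, then_part, else_part = schema
        return ("if", pred, args, eliminate(then_part), eliminate(else_part))
    if kind == "loop":                       # loop T
        body = ("seq", [bump(), eliminate(schema[1])])
        return ("seq", [bump(), ("while", Q, (Z,), body)])
    if kind == "choice":                     # branches listed as [T_n, ..., T_1]
        branches = [eliminate(t) for t in schema[1]]
        result = ("seq", [bump(), branches[-1]])                 # P_1
        for t in reversed(branches[:-1]):                        # P_2, ..., P_n
            result = ("if", Q, (Z,), ("seq", [bump(), t]), ("seq", [bump(), result]))
        return result
    raise ValueError(f"unknown construct: {kind}")

def constructs(schema):
    """Names of all constructions occurring in a schema tree."""
    kind = schema[0]
    if kind in ("seq", "par", "choice"):
        children = schema[1]
    elif kind == "while":
        children = [schema[3]]
    elif kind == "if":
        children = [schema[3], schema[4]]
    elif kind == "loop":
        children = [schema[1]]
    else:
        children = []
    found = {kind}
    for child in children:
        found |= constructs(child)
    return found
```

Each construction is visited once and replaced by a schema of size linear in its own, which is consistent with the polynomial-time claim of the Lemma.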


Definition 4.3. Given a schema $S$, $l, l' \in Vertices(Flowchart(S))$ and variables
$v, v'$, we define the relation $(l, v) \leadsto (l', v')$ to hold if either $edgeType(l, l')$ is an
assignment to $v'$ that references $v$, or $v = v'$ and $edgeType(l, l')$ is not an assignment
to $v'$.

Lemma 4.4. For any free schema $S$, a vertex $l$ in Flowchart(S), a variable $v$ and
$f \in \mathcal{F}$, $\exists DD_S(f, v, l)$ holds if and only if there exist $m, n \in Vertices(Flowchart(S))$
and a variable $w$ such that $edgeType(m, n)$ is an assignment to $w$ with function
symbol $f$ and $(n, w) \leadsto^* (l, v)$ holds.

Proof. This follows immediately from the definition of $\exists DD_S(f, v, l)$. □

Theorem 4.5. Given any free schema $S$, a vertex $l$ in Flowchart(S), a variable
$v$ and $f \in \mathcal{V} \cup \mathcal{F}$, the problem of deciding whether $\exists DD_S(f, v, l)$ holds is
PSPACE-complete, and is PSPACE-hard even if $l = end$ and $S$ does not contain any loop
or $\sqcup$ symbols.

Proof. The PSPACE-hardness result follows immediately from Lemmas 4.1 and
4.2.

To show membership in PSPACE, we first assume that $f \in \mathcal{F}$, since if $f \in \mathcal{V}$
then we can replace $S$ by the schema $S' = f := g(); S$ for a function symbol $g$
not occurring in $S$, for then $\exists DD_S(f, v, l) \iff \exists DD_{S'}(g, v, l)$ holds, and $S'$
can be constructed in polynomial time from the input. The result then follows
from Lemma 4.4 as follows. We non-deterministically guess an edge $(m, n)$ in
Flowchart(S) and a variable $w$ such that $edgeType(m, n)$ is an assignment to $w$
with function symbol $f$ and then decide whether $(n, w) \leadsto^* (l, v)$ holds. This can
be done by guessing a path from $(n, w)$ to $(l, v)$ in the digraph whose vertices are
pairs $(l', v')$ for $l' \in Vertices(Flowchart(S))$ and variables $v'$ occurring in $S$ and
whose edges are given by the $\leadsto$ relation. At any point in the algorithm, only the
current pair $(l', v')$ is stored, rather than the entire graph. By Theorem 3.6, only
polynomial space in the input is required for this, thus proving that the problem
lies in NPSPACE = PSPACE. □
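The characterisation of Lemma 4.4 used in this proof amounts to a reachability question on pairs (vertex, variable). The sketch below illustrates it on an explicitly listed flowchart; it builds the search frontier deterministically and stores visited pairs, so it does not reflect the space bound of the proof. The edge encoding, the vertex names `test`, `then` and `else`, and the function names are assumptions; the example schema is the one shown in Fig. 1.

```python
from collections import deque

# Hypothetical flowchart edges (l, l', edgeType) for the schema of Fig. 1:
#   u := h(); if p(u) then v := f(u); else v := g();
EDGES = [
    ("start", "test", ("assign", "u", "h", ())),
    ("test", "then", ("pred", "p", ("u",), True)),
    ("test", "else", ("pred", "p", ("u",), False)),
    ("then", "end", ("assign", "v", "f", ("u",))),
    ("else", "end", ("assign", "v", "g", ())),
]

def variables_of(edges):
    """All variables assigned or referenced on some edge."""
    found = set()
    for _l, _l2, e in edges:
        if e[0] == "assign":
            found.add(e[1])
            found.update(e[3])
        elif e[0] == "pred":
            found.update(e[2])
    return found

def leads_to(edge, v, v2):
    """The relation (l, v) ~> (l', v2) of Definition 4.3 across one edge."""
    if edge[0] == "assign" and edge[1] == v2:
        return v in edge[3]          # assignment to v2 that references v
    return v == v2                   # not an assignment to v2

def exists_dd_free(edges, f, v, l):
    """Decide (n, w) ~>* (l, v) for some assignment to w with function symbol f."""
    variables = variables_of(edges)
    starts = [(n, e[1]) for (_m, n, e) in edges if e[0] == "assign" and e[2] == f]
    seen, queue = set(starts), deque(starts)
    while queue:
        vertex, var = queue.popleft()
        if (vertex, var) == (l, v):
            return True
        for a, b, edge in edges:
            if a != vertex:
                continue
            for var2 in variables:
                if leads_to(edge, var, var2) and (b, var2) not in seen:
                    seen.add((b, var2))
                    queue.append((b, var2))
    return False
```

For the schema of Fig. 1 this reports that `v` at `end` may depend on each of `h`, `f` and `g`, while `u` at `end` depends on `h` only.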

4.2 Polynomial-time complexity of the existential data dependence problem for the 
class of free schemas with a bound on the number of concurrency constructs 

We now consider the existential data dependence problem in which a constant
upper bound is placed on the number of occurrences of || in the schemas. Owing
to the freeness assumption on the class of schemas under consideration, $\exists DD_S$
can be defined by an iterative data flow analysis. Lemma 4.7 provides the crucial
result in showing that in this case, the problem is polynomial-time bounded. This
result relies on Lemma 4.6, which follows from the inductive definition of a schema
flowchart in Definition 2.2.

Lemma 4.6. Let $B$ be a non-negative integer and suppose that there are
non-decreasing functions $P_B : \mathbb{N} \to \mathbb{N}$ satisfying the following conditions.

(0). $P_{B+1}(n) \geq P_B(n)$ if $n \geq 1$.

(1). $P_B(n) \geq 3$ if $n \geq 1$.

(2, 3, 3'). $P_B(n_1 + \ldots + n_m) \geq P_{C_1}(n_1) + \ldots + P_{C_m}(n_m) + 1$ if $m \geq 2$,
$C_1 + \ldots + C_m = B$ and $n_i \geq 1\ \forall i$.

(4, 4'). $P_B(n + 1) \geq P_B(n) + 1$ if $n \geq 1$.

(5). $P_B(n_1 + \ldots + n_m) \geq P_{C_1}(n_1) \cdots P_{C_m}(n_m) + 3$ if $C_1 + \ldots + C_m = B - m + 1$
and $B \geq m - 1 \geq 1$ and $n_i \geq 1\ \forall i$.

Then for every schema $S$ encoded by a word of length $n$, in which || occurs not more
than $B$ times, Flowchart(S) has not more than $P_B(n)$ vertices.

Proof. This follows by induction on the structure of $S$. Each Condition in the
statement of the Lemma apart from (0) is labelled with the number of the case
in Definition 2.2 that requires it. As an example, consider Condition (5). Assume
that $S = l : S_1 \| S_2 \| \cdots \| S_m$; then Flowchart(S) has vertex set
$\times_{i=1}^{m} Vertices(Flowchart(S_i)) \cup \{start, l, end\}$. Assume that || occurs not more
than $B'$ times in $S$ and exactly $C_i$ times in each $S_i$. Define $B = C_1 + \ldots + C_m + m - 1$.
Suppose each schema $S_i$ is encoded by a word of length $n_i$ and $S$ is encoded by a
word of length $n$, then

$n \geq n_1 + \ldots + n_m$

holds. By the inductive hypothesis, each $Flowchart(S_i)$ has not more than $P_{C_i}(n_i)$
vertices. Hence Flowchart(S) has not more than $P_{C_1}(n_1) \cdots P_{C_m}(n_m) + 3$ vertices,
and hence by (5) and the monotonicity Condition (4), not more than $P_B(n)$ vertices.
Thus since clearly $B' \geq B$ holds, it follows from (0) that Flowchart(S) has not more
than $P_{B'}(n)$ vertices, proving the Lemma in this case. Other cases are treated
analogously. □

Lemma 4.7. Given any integer $B \geq 0$, let $\chi_B$ be the set of all schemas in which
|| occurs not more than $B$ times. Then there exists an algorithm that when given a
schema $S$ in $\chi_B$ as input, constructs the graph Flowchart(S) and is polynomial-time
bounded.

Proof. For each $B \geq 0$, it suffices to prove that the set containing
$|Vertices(Flowchart(S))|$ for every schema $S$ in $\chi_B$ is polynomially bounded in
terms of the number of letters needed to encode $S$. The conclusion of the Lemma
then follows from Part (2) of Theorem 3.6. Consider the functions
$P_B : n \mapsto \max(3, n^{6(B+1)})$. We will show that they satisfy Conditions (0-5) of Lemma 4.6,
and hence that $P_B(n)$ is an upper bound for the number of vertices in Flowchart(S)
for any schema in $\chi_B$ encoded by a word of length $n$. The existence of the polynomial
bound required will follow immediately.

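Clearly the functions $P_B$ satisfy Conditions (0, 1, 4, 4'). We now prove that they
satisfy Condition (2, 3, 3') under the stated assumptions. Observe that

$$
\begin{aligned}
P_B(n_1 + \ldots + n_m) &= (n_1 + \ldots + n_m)^{6(B+1)} = \Big(\big(\sum_{i \leq m} n_i\big)^2\Big)^{3(B+1)} \\
&\geq \Big(\sum_{i \leq m} n_i^2 + 2 n_1 n_2\Big)^{3(B+1)} \geq \Big(\sum_{i \leq m} n_i^2 + 1\Big)^{3(B+1)} + 1 \geq \sum_{i \leq m} n_i^{6(B+1)} + 3m + 1 \\
&\qquad \text{(since each } n_i \geq 1 \text{ and } m \geq 2\text{)} \\
&\geq \sum_{i \leq m} P_{C_i}(n_i) + 1
\end{aligned}
$$

since each $C_i \leq B$ and $P_{C_i}(n_i) = \max(3, n_i^{6(C_i+1)})$. It now remains to prove (5). We have

$$
\begin{aligned}
P_B(n_1 + \ldots + n_m) &= (n_1 + \ldots + n_m)^{6(B+1)} = \prod_{j \leq m} \Big(\sum_{i \leq m} n_i\Big)^{6(C_j+1)} \\
&\qquad \text{(since } \sum_{j \leq m} (C_j + 1) = B + 1\text{)} \\
&\geq \prod_{j \leq m} \big((\max_{i \leq m} n_i + 1)^2\big)^{3(C_j+1)} \geq \prod_{j \leq m} \big(\max_{i \leq m} n_i^2 + 2 + 1\big)^{3(C_j+1)} \\
&\qquad \text{(since each } n_i \geq 1 \text{ and } m \geq 2\text{)} \\
&\geq \prod_{j \leq m} \big(\max_{i \leq m} n_i^2 + 2\big)^{3(C_j+1)} + 3 \geq \prod_{j \leq m} \big(n_j^{6(C_j+1)} + 2^3\big) + 3 \geq \prod_{j \leq m} P_{C_j}(n_j) + 3,
\end{aligned}
$$

thus proving the Lemma. □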
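As a sanity check on these inequalities, the conditions of Lemma 4.6 can be tested numerically for the functions $P_B(n) = \max(3, n^{6(B+1)})$ on small parameter values. The sketch below is only an exhaustive check over a finite range chosen for illustration (the bounds `max_m`, `max_n` and `max_c` are arbitrary assumptions); it is not a substitute for the proof.

```python
from itertools import product
from math import prod

def P(B, n):
    """The candidate bound P_B(n) = max(3, n^(6(B+1)))."""
    return max(3, n ** (6 * (B + 1)))

def check_conditions(max_m=3, max_n=3, max_c=2):
    """Test Conditions (0), (1), (2,3,3'), (4,4') and (5) on a small finite range."""
    for B in range(max_c + 1):
        for n in range(1, max_n + 1):
            if P(B + 1, n) < P(B, n):            # Condition (0)
                return False
            if P(B, n) < 3:                      # Condition (1)
                return False
            if P(B, n + 1) < P(B, n) + 1:        # Condition (4,4')
                return False
    for m in range(2, max_m + 1):
        for ns in product(range(1, max_n + 1), repeat=m):
            for cs in product(range(max_c + 1), repeat=m):
                total = sum(ns)
                # Condition (2,3,3'): B = C_1 + ... + C_m
                if P(sum(cs), total) < sum(P(c, n) for c, n in zip(cs, ns)) + 1:
                    return False
                # Condition (5): C_1 + ... + C_m = B - m + 1
                if P(sum(cs) + m - 1, total) < prod(P(c, n) for c, n in zip(cs, ns)) + 3:
                    return False
    return True

print(check_conditions())
```

Python integers have arbitrary precision, so the large powers involved are compared exactly.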

Definition 4.8. Let $S$ be a schema. We define the set $W_S$ to be the subset of
$(\mathcal{V} \cup \mathcal{F}) \times \mathcal{V}$ for which both components occur in $S$.

Definition 4.9 (recursive definition of $\exists DatDep_S$ for a schema $S$). Let $S$ be a schema.
Then $\exists DatDep_S$ is the function $H$ from $W_S \times Vertices(Flowchart(S))$ to $\{T, F\}$
satisfying the following

(1) $H(v, v, start) = T$ for all $(v, v) \in W_S$.

(2) If $w$ is a variable, $(l, l')$ is an edge in Flowchart(S) and $edgeType(l, l')$ is not an
assignment to the variable $w$, then $H(f, w, l) = T \Rightarrow H(f, w, l') = T$ holds.

(3) If $x, y \in \mathcal{V}$ and $(l, l')$ is an edge in Flowchart(S) and $edgeType(l, l')$ is an
assignment to the variable $y$ that references $x$, then $H(f, x, l) = T \Rightarrow H(f, y, l') = T$
holds. If in addition, the assignment $assign_S(l, l')$ has function symbol $h$, then
$H(h, y, l') = T$ holds,

for which the set $H^{-1}(T)$ is minimal.

Theorem 4.10. Let $S$ be a free schema and let $(f, v) \in W_S$. Let
$l \in Vertices(Flowchart(S))$. Then $\exists DD_S(f, v, l) \iff \exists DatDep_S(f, v, l)$ holds.

Proof. Define the function $K : W_S \times Vertices(Flowchart(S)) \to \{T, F\}$ as follows;
$K(f, v, l) = T$ if and only if there is a path $\mu$ through $S$ from start to $l$
such that the term $\mathcal{M}[\![\mu]\!]_e(v)$ contains $f$. Since $S$ is free, $\exists DD_S = K$ holds. Thus
it suffices to show that $K = \exists DatDep_S$ holds, and this follows from the fact that
Definition 4.9, with $K$ in place of $\exists DatDep_S$, gives an equivalent definition of $K$. □

The main Theorem of this subsection follows. 

Theorem 4.11. Let $B \geq 0$ and let $S$ be a free schema in which the ||
construction occurs not more than $B$ times, and let $f \in \mathcal{V} \cup \mathcal{F}$ and $v \in \mathcal{V}$. Let
$l \in Vertices(Flowchart(S))$. Then it can be decided in polynomial time whether
$\exists DD_S(f, v, l)$ holds.
Proof. From Theorem 4.10 it suffices to prove that it can be decided in polynomial
time whether $\exists DatDep_S(f, v, l)$ holds, under the restriction given on ||
constructions. We compute $\exists DatDep_S(f, v, l)$ as follows, using the graph Flowchart(S).
We may assume that $(f, v) \in W_S$, since otherwise $\exists DD_S(f, v, l)$ can clearly be
decided in polynomial time.

We approximate $\exists DatDep_S$ on the domain $W_S \times Vertices(Flowchart(S))$ by a
sequence of functions $H_1, H_2, \ldots : W_S \times Vertices(Flowchart(S)) \to \{T, F\}$. Firstly, let $H_1$ satisfy Condition
(1) of Definition 4.9 for every $(v, v) \in W_S$ and let $H_1(f, v, l) = F$ whenever $(f, v, l) \neq (v, v, start)$.
Given a function $H_i$ that does not satisfy every instance of Condition
(2) or (3) of Definition 4.9, obtain the function $H_{i+1}$ by altering $H_i$ on one such
instance, so that $H_{i+1}^{-1}(T)$ contains every element of $H_i^{-1}(T)$, plus an additional
one. Therefore a maximal function $H_n$ is eventually reached with
$n \leq |W_S \times Vertices(Flowchart(S))|$, which is polynomially bounded in terms of $S$, by Lemma
4.7. In addition, each function $H_i$ can be encoded by listing the elements of $H_i^{-1}(T)$,
thus $H_n$ is computable in polynomial time. By induction on $i$, every set
$H_i^{-1}(T) \subseteq \exists DatDep_S^{-1}(T)$, and $H_n$ satisfies all three conditions in Definition 4.9, hence the
minimality condition in the definition of $\exists DatDep_S$ implies $H_n = \exists DatDep_S$, thus
proving the Theorem. □
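The approximation sequence $H_1, H_2, \ldots$ can be sketched as a fixed-point iteration over the set of triples mapped to T. The code below is an illustration under assumptions: the edge encoding and vertex names are hypothetical, the example is the schema of Fig. 1, and the function symbol of every assignment edge is recorded at its target vertex (our reading of the second part of Condition (3), which presumes every flowchart vertex lies on a path from start).

```python
# Hypothetical flowchart edges (l, l', edgeType) for the schema of Fig. 1:
#   u := h(); if p(u) then v := f(u); else v := g();
EDGES = [
    ("start", "test", ("assign", "u", "h", ())),
    ("test", "then", ("pred", "p", ("u",), True)),
    ("test", "else", ("pred", "p", ("u",), False)),
    ("then", "end", ("assign", "v", "f", ("u",))),
    ("else", "end", ("assign", "v", "g", ())),
]

def exists_dat_dep(edges, variables):
    """Least set of triples (f, v, l) closed under Conditions (1)-(3) of Definition 4.9."""
    facts = {(v, v, "start") for v in variables}       # Condition (1)
    for _l, l2, e in edges:
        if e[0] == "assign":
            facts.add((e[2], e[1], l2))                # function symbol of the assignment (assumed reading)
    changed = True
    while changed:                                     # each round adds at least one triple or stops
        changed = False
        for l, l2, e in edges:
            for (f, x, at) in list(facts):
                if at != l:
                    continue
                new = set()
                if e[0] == "assign":
                    _, y, _h, refs = e
                    if x != y:
                        new.add((f, x, l2))            # Condition (2): x is not overwritten
                    if x in refs:
                        new.add((f, y, l2))            # Condition (3): y := ...x...
                else:
                    new.add((f, x, l2))                # Condition (2): edge is not an assignment
                if not new <= facts:
                    facts |= new
                    changed = True
    return facts

H = exists_dat_dep(EDGES, {"u", "v"})
print(sorted(t for t in H if t[2] == "end"))
```

The number of rounds is bounded by the number of triples, which mirrors the bound on $n$ in the proof; the set of triples grows monotonically, matching the inclusion of $H_i^{-1}(T)$ in $H_{i+1}^{-1}(T)$.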



5. CONCLUSIONS 

We have extended conventional data dependency problems to arbitrary schemas 
and have shown that both the existential and universal data dependence problems 
lie in PSPACE for schemas without concurrency constructs and having only equal- 
ity assignments, provided that there is a constant upper bound on the arity of any 
predicate symbol occurring in the schemas. We have also shown that without this 
upper bound, both problems are PSPACE-hard. This PSPACE-hardness result, 
Theorem 3.5, entails constructing a schema without this arity restriction; see the
predicates $Q_3$ and $q_i$ in Figs. 6 and 7. This suggests that assuming this restriction
may result in a lower complexity bound than PSPACE. Since schemas with
predicates approximate the behaviour of real programs much more accurately than
wholly non-deterministic programs which are normally used in program analysis, a
reasonable class of schemas for which our two problems could be decided tractably 
would be of considerable interest. 

In addition, we have proved that for free schemas, existential data dependence 
is decidable in polynomial time provided that a constant upper bound is placed 
on the number of occurrences of || in the schemas being considered. We have not 
attempted to prove an analogous result for the universal data dependence relation. 
This would be an interesting subject for future investigation. 

As mentioned in the Introduction, many concurrent systems have only relatively 
few threads even if they have many lines of code, and therefore the bound on the 
number of occurrences of || is not particularly restrictive. The freeness hypothesis 
(equivalent to assuming non-deterministic branching) is common in program anal- 
ysis, and its use ensures that no false positives for data dependence are computed. 
This is important in areas such as security where we wish to show that the value 
of one variable x, whose value is accessible, cannot depend on the value of another 
variable y whose value should be kept secret. 